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Implicit  enumeration  algorithm  of  integer  programming  on  ILLIAC  IV 
li   Introduction 

In  this  paper,  let  us  consider  the  following  integer  programming 
problem; 

minimize  n 

3-1  J  J  (1) 


n 
subject  to        a.  +  Z  a   v  ^  n 

10      ,:-,     U  Xj  -  °  (2) 


J=l 


i  =  1,  2. 


,  m, 


where  c  >  0  for  i    -  1      o  „    j  ■, 

j  -       J  "  L>   2>  '">    n>  and  where  each  variable  x.  assumes 

only  non-negative  integer  values.   Furthermore,  the  value  of 'each  x. 
W  be  assumed  to  be  limited  to  1  or  0  only,  without  loss  of  generality 
because  any  non-zero  integer  may  be  expressed  as 

Xk2k  +  Xk-l2k"1+--.  +-121+x() 
with  variables  xR,  x^,  ...,  Xq  whlch  assume  the  value^  ±   or  Q< 

This  programming  problem  has  been  attracting  intense  attention 
from  people  in  the  various  fields,  because  a  number  of  problems  which 
can  not  be  formulated  as  linear  programming  problem  can  be 
formulated  as  integer  programming  problems.   Those  problems  include 
knapsack  problems,  traveling  salesman  problems,  covering  problems,  job- 
shop  -scheduling  problems,  personnel-assignment  problems,  fixed-cost 

problems  and  so  forth  [-2]  [4] 

lorth.        Logical  design  problems  of  optimum  switching 

networks  also  have  been  formulated  as  integer  programming  problems.  ^Hl3][3] 

The  integer  programming  problems  appear  far  more  difficult  than 

the  linear  programming  problems,  judging  from  the  available  reports[5][7][l0][3^ 


on  the  computational  experience,  though  a  considerable  number  of  algorithms 
have  been  proposed  to  date.   These  algorithms  except  Gomory's  are  completely 
different  from  the  simplex  method  for  linear  programming. 

Probably  the  most  well  known  and  tested  algorithms  are  Gomory's 

ro-l  ,.        n  Jl][8][6][10] 

cutting  plane  method1 yj  and  the  implicit  enumeration  approach 

In  this  paper,  we  will  consider  the  implementation  of  the  implicit 
enumeration  algorithm  on  ILLIAC  IV  computer,  to  evaluate  the  effectiveness 
of  parallel  computation  available  with  ILLIAC  IV.   Although  only  the  case 
of  a  dense  matrix  (i.e.  a  matrix  with  many  non-zero  elements)  is  considered 
in  this  report,  the  speed-up  expected  from  the  parallel  computation  appears 
remarkable,  and  encourages  further  investigation  in  this  respect. 

The  implicit  enumeration  algorithm  have  a  few  different  versions. 
T.  Ibaraki,  T.  K.  Liu,  C.  R.  Baugh  and  S.  Muroga[l°  improved  these 
versions  (except  Geoffrion's  latest  work[T]).   The  program  code  based  on 
their  version  was  named  ILLIP[ll]  (iLLinois  Integer  Programming).   The 
implicit  enumeration  considered  here  is  the  simplified  version  of  that  of 
[10].   Therefore  all  the  proofs  and  further  detailed  discussion  will  be 
found  in  or  derived  from  the  argument  in  [10]. 

Typically  it  is  estimated  that  the  entire  computation  is  speeded 
up  by  about  100  times  by  using  the  parallel  computation  scheme  provided 
by  ILLIAC  iv,  compared  with  the  serial  computation  available  with  an 
ordinary  commercial  computer.   In  the  above  factor,  the  speed-up  due  to 
the  improvement  of  switching  time  of  logic  gates  is  not  taken  into  considerate 
since  the  exact  figure  about  the  hardware  speed-up  does  not  seem  available  yet. 


•   Outline  of  Implicit  Enumeration 

Let  us  begin  with  several  definitions,   Any  x,  whose  coordinates 
are  0  or  1,  is  called  a  solution.  A  solution  that  satisfies  constrants(2) 
is  -lied  a  feasible  solution,  and  a  feasible  solution  that  minimizes 
e"  •  x  is  called  an  optimal  (feasible)  solution.   A  partial  solution  S  is 
defined  as  an  assignment  of  the  values,  one  and  zero,  to  a  subset  of  n 
variables.   Any  variable  not  assigned  a  value  in  S  is  called  free.  The 
set  of  indices  of  free  variables  is  denoted  by  F(s). 

We  adopt  the  notational  convention  that  the  symbol  3   denotes 
Xj  =  1  and  the  symbol  -j  denotes  x.  =  0.   Hence,  if  n  =  5  and  S  =  (3,5,  -2), 
x3  -1,  x5=l  x^-O,  and  ^  and  x^  are  free.   The  order  in  which  the  elements 
of  S  are  written  will  be  used  to  represent  the  order  in  which  the  variables 
corresponding  to  the  elements  are  set  to  1  or  0  by  the  algorithm.  A 
completion  of  a  partial  solution  S  is  defined  as  a  solution  that  is  derived 
from  S  by  specifing  all  free  variables. 

The  whole  scheme  of  implicit  enumeration  is  illustrated  in  Fig.  1. 
Given  a  current  partial  solntion  S,  subroutine  CHK-IEQ  detects  some  of 
the  following  three  S -conditions: 

S^condition^Jai  S  forces  free  variables  to  be  specified  in  order 

to  obtain  feasible  completions  of  the  present 
partial  solution  S, 

SIcondition__Cb)   there  is  no  feasible  completion  of  S  or  even  if 

there  are  some,  they  are  not  better  (smaller  ' 
c  •  x  than  the  incumbant)  feasible  completion 
oi  o,    or 

S^condition__^c2  the  best  feasible  completion  of  3  ls  found  and 

it  is  better  than  the  incumbent. 


Fig.  1  Implicit  Enumeration  Algorithm 
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Is  there  a  better 
feasible  solution  than 
the  incumbent? 


eplace  the  incumben; 
by  the  new  better 
feasible  solution. 


Backtrack: 
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Locate  the  right  most  element  of  S  which 
not  underlined.   If  none  «^''*^^' 
otherwise,  change  the  sign  of  *e  element, 
underline  it  and  delete  all  elements  to 
the  right.  


If  S  forces  some  variables  to  be  specified  (S-condition  (a)), 
we  augment  S  by  the  variable  specified  to  that  value  with  underline.   In 
case  of  S-condition  (b),  we  simply  backtrack.   However,  if  S-condition 
(c)  occurs,  we  replace  the  incumbent  with  this  feasible  completion  before 
backtracking.   Finally,  if  none  of  S-conditions  are  detected,  we  go  to 
ACM-VAK  to  augment  the  partial  solution  by  a  certain  variable  which  is 
selected  according  to  the  rule  which  will  be  explained  later,  without 
underline. 

In  our  calculation,  the  objective  function  value  is  taken  care 
of  by  an  extra  inequality: 

z  -  1  -  c  x  >  0  ,  (3)* 

where  z  is  the  objective  function  value  for  the  incumbent.   The  value  of 
:  is  updated  whenever  the  incumbent  is  replaced.   Then  if  we  consider 
inequality  (3)  together  with  (2),  any  feasible  solution  is  better  than 
the  incumbent,  thereby  eliminating  in  the  program  checking  of  whether  the 
obtained  feasible  solution  is  better  than  the  incumbent  or  not. 

The  detection  of  the  S-conditions  is  done  by  making  use  of  the 
following  three  columns. 

y,(s)  =  S  a.  .  x. 

1  10  3 

\(S)    =  y,(S)  +   E   a.  . 

JeF(S)  1J 

a.  .X) 
ij 

^(S)  =  y.(S)  +   E   a,. 


jeF(s)  1J 
a.  <0 
ij 


0,  1,  2,  ...,  m, 


Assumes  that  we  want  to  obtain  a  single  optimum  solution  for  a  given 
integer  programming  problem. 

5 


where  i  -  0  stands  for  the  inequality  defined  hy  (3).  Note  that  y.(S)  is 
the  lefthand  side  value  of  the  i  th  inequality  for  the  present  value  of 
the  partial  solution  and  u.  (s)  and  ^(S)  are  the  maximum  and  the  mini*- 

-,4.-    «-p  q  T.p«mectivelv-  Then  follow  the  next 
values  of  y  (S)  over  all  completions  of  S,  respectively 

properties  "which  is  used  for  the  detection  of  the  S-conditions  in 


CHK-IEQ. 

(1) 


(2) 


If  u  (S)  <  0  for  some  i,  then  the  i-th  inequality 

i 
cannot  he  satisfied  for  any  completion  of  S. 

This  hnplies  that  the  present  partial  solution 

S  is  infeasihle. 

If  |  a.  .|  >  u.(S)  and  x.  is  free,  then 


x  =  0  must  hold  if  a   <  0 
J  J 

and   x.  =  1  must  hold  if  a  .  >  0. 
(3)    If  2.(S)  >  0  for  some  i,  the  i-th  inequality 
can  never  assume  a  negative  value  for  any 
completion  of  S.   Therefore  we  can  eliminate 
the  i-th  inequality  from  further  consideration 
as  far  as  the  completion  of  S  is  concerned,  thus 
reducing  the  matrix  size  we  are  working  on. 
(U)    If  y(S)  >  0,  then  the  completion  obtained  from  S 

•  by  setting  all  free  variables  to  0  is  feasible  and  is 
the  best  among  all  the  feasible  completions  of  S. 
For  detailed  discussion,  see  the  reference  [10]. 
The  backtracking  procedure  shown  in  Fig.  1  is  the  same  as  that 
proposed  by  Glover  [8].   An  outline  of  the  procedure  may  be  understood 
from  the  description  in  Fig.  1. 


Subroutine  AGMT-VAR  in  our  program  works  as  follows.   This  is  a 
simplified  version  of  the  method  discussed  in  [10].   Let  us  prepare  a  row 
called  p  in  which  Pj|  for  each  jeP,  represents  the  number  of  inequalities 
which  satisfies  L(S')   >  0,  where  S'  is  S  with  x.  =  1  augmented.   Among 
all  p..,  jeF,  let  p..   assume  the  maximum  value.   Then  AGMT-VAR  augments 
the  current  partial  solution  S  with  x   =  1,  resulting  in  a  new  partial 
solution,  for  which  the  check  of  the  S-conditions  by  CHK-IEQ  will  be 
subsequently  applied. 

Any  scheme  of  AGMT-VAR  will  let  the  algorithm  converge  in  a 
finite  number  of  iterations.   Here  the  above  scheme  is  selected  because 
it  seems  reasonably  efficient  in  spite  of  its  simplicity,  as  anticipated 
from  the  computational  experience  in  [10]. 

In  addition  to  these  subroutines,  CHK-IEQ,  backtrack  and  AGMT-VAR, 
another  subroutine  called  UD-YLU*  is  included,  as  shown  in  Fig.  2.   This 
UD-YLU  updates  the  value  of  y,  T  and  u  whenever  the  partial  solution  is 
modified.   Therefore  this  subroutine  is  called  from  each  of  the  above 
subroutines,  CHK-IEQ,  backtrack  and  AGMT-VAR,   Upon  entering  UD-YLU,  the 
new  values  of  y,  %    u  are  calculated,  and  the  feasibility  (y  >  0)  or  the 
infeasibility  (u.  <  0  for  some  i)  of  the  partial  solution  is  detected  in 
this  subroutine.   Usually  after  the  execution  of  UD-YLU  the  program  control 
returns  to  the  subroutine  from  which  UD-YLU  was  called  but  in  the  feasible 
or  infeasible  case,  the  execution  of  backtrack  follows  (after  replacing 
the  incumbent  if  feasible.). 

The  whole  scheme  is  illustrated  in  Fig.  2.  More  detailed 
description  of  each  individual  subroutine  will  be  given  in  the  next  section, 
using  flow  charts  for  the  illustration  of  the  implementation  on  ILLIAC  IV. 

UD-YLU  is  scattered  in  other  subroutines  in  Fig.  1.   But  for  the  sake 
of  ease  m  programming,  we  make  it  a^  independent  subroutine. 
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Fig.   2.        Linkage  of  subroutines 
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3.   Implicit  Enumeration  on  ILLIAC  IV 

In  this  section,  first  the  overall  storage  allocation  is  discussed. 
Then  using  this  allocation  scheme,  each  subroutine  is  described  with  a 
flow  chart.   Actual  number  of  each  instruction  used  in  the  procedure  is 
counted  for  each  of  both  parallel  (Illiae  IV)  and  serial  computation 
(an  ordinary  computer)  cases.   An  estimation  of  speed  difference  between 
these  two  cases  will  be  summarized  in  the  next  section. 

For  the  sake  of  simplicity,  consider  a  problem  with  1000  variables 
and  256  inequalities.   The  density  of  non-zero  coefficients  of  the  given 
problem  is  high  and  hence  no  packing  scheme  of  data  is  employed. 

Fig.  3  illustrates  the  storage  allocation  scheme  in  ILLIAC  IV. 
Each  inequality  is  assigned  to  1  to  100U  of  each  PEM  (i.e.  Processing 
Element  Memory)  and  data  processing  of  each  inequality  is  assigned  to  each 
PE  (i.e.  Processing  Element.)  As  illustrated,  locations  1,  2,  3  of  PE 
serve  as  u,  1,   y  respectively,  whereas  locations  1*  ~  lool,  are  used  for 
accomodating  the  coefficients  of  problem  including  those  of  the  objective 

function.   Besides  such  entries  «  ,7  7  r 

ucn  entries  as  u,  i,   y  and  coefficients  of  the  given 

problem,  four  kinds  of  the  arrays  p,  x,  s,  opt  (the  last  three  of  these 
will  be  defined  in  the  following.)  are  also  provided.  These  arrays  are 
allocated  in  the  locations  of  1005  through  1020.   Each  of  these  four 
has  1000  entries  so  that  each  entry  may  correspond  to  a  variable  of  the 
Siven  problem,  x  keeps  the  value  of  the  current  partial  solution  in 
such  a  way  that  the  value  of  x.  is  stored  in  the  J-th  entry  of  £   The 
entries  of  *  are  counted  from  the  first  entry  of  the  first  row  to  the 
last  entry  of  the  fourth  row  of  x,  as  shown  in  Eig.  3.  (The  shaded  areas 
-  Fig.  3  are  empty.)   s  also  represents  the  current  partial  solution  in  a 
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Fig.    3.        Storage  Allocation 
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different  way  from  x.   Each  assignment  of  variable  is  entered  from  the 

beginning  of  s  according  to  the  order  of  their  generation  in  the  course 

of  the  computation.   Note  that  the  packing  of  s  start  from  the  right  hand 

side  and  from  the  uppermost  row,  as  shown  in  Fig.  3.   In  other  words,  the 

array  s"  is  used  to  keep  track  of  the  enumeration  status  and  to  operate  the 

backtracking  procedure  on  it.   p  is  used  to  store  the  parameters  calculated 

in  AGMT-VAR  as  explained  in  Section  2.   opt  contains  the  incumbent  solution. 

The  information  of  the  value  of  x .  is  also  stored  in  the  j-th  location 

J 

of  p  and  opt,  in  the  same  way  as  x. 

With  these  preparations,  the  detailed  description  of  each  subroutine 
is  given  next.   Note,  however,  that  this  report  is  a  preliminary  study 
and  more  efficient  program  might  be  conceivable  by  utilizing  the  capability 
of  ILLIAC  IV  more  carefully. 
3-1  CHK-IEQ 

Fig.  k   is  a  flow  chart  of  subroutine  CHK-IEQ.   Roughly  speaking, 
it  checks  whether  there  exists  a. .  such  that  la. . i  >  u.  for  every  free 

variable  x..   If  the  condition  is  satisfied  by  some  free  variable,  the 

J 

variables  are  set  to  0  or  1  according  to  the  sign  of  a. .,  followed  by 

-*-t) 

the  call  of  UD-YLU  to  update  y,  I   and  u. 

The  speedup  by  the  parallel  computation  is  mainly  due  to  the 
portion  where  the  condition  la. .  >  u.  and  the  sign  of  a. .  is  checked 
(marked  as  A  in  Fig.  k) .      For  example  the  existence  of  | a.  .  |  >  u.  and 
a.  .  <  0  can  be  found  by  a  procedure  shown  in  Fig.  5.   First  the  sum  of 
two  arrays  u  and  a.  is  retained  in  all  registers  of  256  PE's.   Then  the 
sign  bits  of  these  256  registers  are  sent  to  a  Mode  register,  by  which 
the  existence  of  negative  entries  in  all  the  above  256  registers  is 
checked  by  one  instruction.   This  process  needs  1L  and  1A  operations* 


*  In  the  following,  L  stands  for  Load,  A  for  Add,  S  for  Subtract, 
ST  for  STore  and  R  for  Routing. 
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Fig.  5.   Detailed  description  of  portion  A  in  Fig.  k. 
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whereas  the  serial  computation  needs  256  L  and  A  and  also  256  sign 
checks.   We  count  the  number  of  instructions  L,  ST,  A,  S,  R  only,  since 
these  consume  most  of  execution  time. 

Consequently  the  total  number  of  instructions  used  to  perform  the 

procedure  in  Fig.  h   for  each  x  is  given  by 

J 

(L,S), 


if  x .  is  not  free,  and 


(hh,   a,  3S) 


if  a  free  variable  x.  is  set  to  neither  0  nor  1.   If  x.  is  set  to  0, 

(kL,    3ST,  2A,  2S) 

steps  are  necessary  and  if  x .  is  set  to  1, 

J 

(5L,  3ST,  2A,  3S) 

steps.   The  above  numbers  assume  the  parallel  computation. 

On  the  other  hand,  if  the  serial  computation  is  adopted,  for 

each  free  x., 
3 

(L,S) 
(51UL,  256A,  258S), 

(259L,  3ST,  25TA,  2S) 

and    (515L,  3ST,  25TA,  258S) 

steps  are  needed  corresponding  to  the  four  cases  in  which  x..  is  not 

free,  x  is  specified  to  0  and  1,  and  no  assignment  is  performed  on 
3 

x.,  respectively. 
J 

Ik 


3-2      AGMT-VAR 

AGMT-VAR  works   as   shown   in  Fig.    6.      As   roughly  explained  in 
Section  2,    ACM-VAR  first  calculates  the  parameter  array  p,    and  then 
picks  up  the  variable  x.     which  has  the  maximum  value   in  p.      Then  x 

0  J'o 

is  set  to  1.   After  updating  y,  %    u  accordingly,  CHK-IEQ  is  entered  to 

check  the  new  partial  solution. 

The  portion  B  indicated  in  Fig.  6  is  shown  in  more  detail  in 
Fig.  7.   First,  assuming  that  a  free  variable  x.   is  set  to  1,  the  new 
value  of  %    i.e.,  J.,  is  calculated.   The  s±gn  Qf  ^   ^    .g  ^  ^^ 

(denoted  by  C  in  Fig.  7)  and  then  put  in  another  register  in  each  PE. 
This  process  can  be  done  by  first  transfering  the  sign  bits  of 
registers  containing  the  values  of  V.    to  one  of  the  mode  register.   Then 
the  contents  of  the  mode  register  is  distributed  back  to  each  register 
in  the  corresponding  PE. 

In  the  next  block  D,  the  contents  of  256  registers  are  summed  up 
to  count  the  number  of  i?'  with  non-negative  values.   This  is  done  by 
using  the  "logarithmic  sum"  scheme  known  among  ILLIAC  IV  people.   In 
other  words,  a  number  in  each  odd-numbered  PE  added  to  a  number  of  the 
PE  on  its  right  by  the  first  instruction.  And  then  addition  of  each 
adjacent  pair  of  sums  is  repeated  until  the  total  sum  is  formed  at  the 
right  most  PE.   To  obtain  the  total  of  256  PE%  (8A,  U8K)  instructions 
are  necessary. 

The  rest  of  the  flow  chart  is  self-explanatory. 
An  estimation  of  the  number  of  major  instructions  necessary  to 
perform  the  work  in  portion  B  is  as  follows: 
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Fig.  7-    Portion  B  of  AGMT-VAB. 
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(1)  When  x.  is  free; 

(11,  1ST,  9A,  U8R)   :   parallel  computation, 
and  (256L,  1ST,  512 A)    :   serial  computation, 

(2)  when  x.  is  specified; 

(11,  IS)   :   "both  parallel  and  serial  computations. 
After  obtaining  the  values  of  p  by  applying  the  process  to  all  the 
variables,  the  rest  of  the  flow  chart  needs 

(L,  3ST,  A,  9S,  kSR)        :   parallel  computation 
(257L,  2ST,  A,  25TS)    :   serial  computation. 

3-3  Backtrack 

Subroutine  Backtrack  is  entered  from  UD-YLU,  which  will  be  explained 
in  Section  3-k,    after  finding  the  feasibility  or  infeasibility.   Backtrack 
operates,  on  the  array  s  which  keeps  the  value  of  variables  in  the  current 
partial  solution  in  such  form  that  each  specified  variable  is  stacked 
from  the  beginning  of  s  in  the  order  of  variable  assignment 
made  during  the  computation,  either  with  underline  or  without  underline. 
If  the  value  of  a  variable  is  positive,  it  is  underlined,  and,  otherwise, 

not  underlined. 

As  shown  in  Fig.  8,  the  program  first  locates  the  last  negative 
(not  underlined)  entry.   Then  the  sign  of  the  entry  is  switched  to  positive, 
simultaneously  deleting  all  the  entries  which  follow  after  that  variable. 
Of  course,  this  is  the  restatement  of  the  operation  of  Backtrack  illustrated 

in  Fig.  1. 

Fig.    9  is   a  more   detailed  flow  chart  of  Backtrack,   which   should 

be   self-explanatory. 
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Fig.  8.    Backtrack 
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Now  assume  that,  on  the  average,  the  loop  in  Pig.  9  is  repeated 
2.5  ti.es  to  locate  the  last  negative  entry.   The  number  of  major 
instructions  used  to  go  through  this  subroutine  is; 

(3.5L,  2ST)   :   parallel  computation 

(6*aL,  2ST)   :   serial  computation. 
3-h     UD-YLU 

This  subroutine  shown  in  T?iff  in  ■?«       ^~. 

own  in  Fig.  10  ls  most  efficiently  speeded  up 

by  making  use  of  the  parallel  computation. 

Depending  on  the  situation  in  which  this  subroutine  is  called, 
there  are  six  cases; 

xr   °-*1'  <>->*>  i-o,  i-*,  *-i,  *-o, 

where  *  means  that  the  variable  is  not  specified  (free).   In  each  case, 
the  updating  procedure  of  y,  J   and  »  is  slightly  ^^  ^  ^ 
significantly.   For  example,  if  x.  is  changed  frQm  ±  ^  ^    ^  _  ^ 
V    and  w<  are  determined  from  the  present  values  y,    J  and  u,  by 


y\  =  y.  -  a. 


ij 


V.   = 


u:  = 
i 


*1- 

i 

a.  . 

,    if  a. .   >  o 

l'i 

>  if  a. .  <  o 
ij  - 

\      l 

;  if  a. .  >  o 
ij  - 

u±  -  a     if  a. .  <  o. 
1    1J       ij 


Other  cases  have  similar  changes. 


The  detailed  procedures  only  for  two  of  +h«  qT, 

ujf  ior  -cwo  of  the  above  six  cases  are 

illustrated  in  Fig.  io. 
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Besides  updating  y,  1,   u,  (labeled  as  "updating"  in  Fig.  10) 
this  subroutine  checks  the  infeasibility  and  feasibility  of  the  new 
partial  solution  by  observing  the  sign  of  the  new  u  and  y.   As  mentioned 
in  Section  2,  if  at  least  one  of  u  is  negative,  then  the  partial  solution 
is  infeasible.   In  addition,  if  y  >  0  is  satisfied,  then  the  partial 
solution  is  feasible.   The  exits,  feasible  and  infeasible,  indicate  these 
two  cases.   If  feasible  and  infeasible  cases  occur,  appropriate  actions 
are  taken,  though  they  are  not  described  in  detail  since  these  cases 
occur  very  rarely  in  the  entire  computation.   (in  most  of  time,  the 
program  control  leaves  UD-YLU  through  the  exit  "updating".) 

With  these  remarks,  the  flow  chart  of  Fig.  10  should  be  easily 

understood. 

The  number  of  major  instructions  used  in  a  branch  is 

(3L,  3ST,  3S)  :   parallel  computation 
256  (3L,  3ST,  3S)  :   serial  computation, 

where  those  cases,  feasible  and  infeasible,  are  assumed  not  to  occur. 
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Fig.  10.    UD-YLU 


X 

:l->  0 

\ 

< 

< 

, 

u       =  u  -   a. 

. 

Find  the 
right  branch 


i  ~ '  i  o     > 


yes 


no 


i1  =  i  -  a 


y   =  y  -  K 


Infeasible 


« ies_ 


Feasible 
yes 


y  '  >  o 


no 


Updating 


Load  a.   and 

J 
make  a  mask 
for  a.  .  <  0 


u.     =  u  -   a.(<0) 


"CEDEEO 


Complement  the 
above  mask 


J,  no 


1  =  1-   a.(X)) 


y    =  y  -  a . 


23 


k.      Estimation  of  Computation  Time  on  ILLIAC  IV 

Using  the  number  of  instructions  estimated  for  each  subroutine 
in  Section  3,  this  section  derives  an  overall  speed-up  due  to  the  parallel 
computation  capability  provided  by  ILLIAC  IV,  over  the  conventional 

serial  computation. 

Before  the  conclusion,  let  us  obtain  the  total  number  of  instructions 

necessary  for  going  through  CHK-IEQ  and  AGMT-VAR  in  a  typical  situation 

[10][3l 


derived  from  our  computational  experience 


.   In  Section  3,  the 


estimation  of  operations  pertiment  to  each  variable  x..  only,  not  the 
estimation  for  the  total  work  for  these  subroutines,  were  given. 

Now  assume  that,  before  entering  CHK-IEQ,  500  out  of  1000  variables 
are  free,  and  50  among  these  500  free  variables  are  specified  in  CHK-IEQ, 
25  variables  to  0  and  25  variables  to  1.   Then  the  total  number  of 
instructions  used  to  handle  1000  variables  is 

500  (L,S)  +  U50  (UL,A,3S)  +  25  (UL,  3ST,  2A,  2S) 
+  25  (5L,3ST,2A,3S) 

=   (2125L,    150ST,    600A,    1075S) 

in  case   of  the  parallel  computation.      On  the   other  hand,    the   serial 
computation  needs  the  following  number  of  instructions  under  the   same 

condition. 

1000    (L,S)    +  U50    (5lUL,3ST,257A,2S) 

+  25  (259L,3ST,257A,2S)  +25  (515L, 3ST,257A,258S) 
=   (250650L,150ST,128050A, 122600s). 

Adopting  the   same   assumption  that  500  variables   are  free,    the 
total  number  of  instructions  used  to  go  through  AGMI-VAR  is; 
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500   (L,ST,9A,1)8R)    +  500   (L,S) 

+        (L,3ST,A,9S,I+8R) 

=        (1001L,  503ST,  1+501A,  509S,2i|0i|8R) 

:  parallel  computation, 

500  (256L,ST,52J5A)  +  500  (L,S) 
+    (257L,2ST,A,257S) 

(128757L,502ST,256001A,757S) 

:      serial  computation. 
With  these  figures,    let  us  proceed  to  estimate  the  total  speedup 
in  the  entire  illicit  enumeration  procedure.      There  are  two  main  paths 
in  the  overall  flow  chart  shown  in  Fig.    1  : 

(a)  path  A      :      CHK-IEQ  -.  AGMT-VAR 

(b)  PathB     :      °HK-IE^  Metihle  -  Bachtra*. 

First  consider  path  A.      Under  the  assumption  that  50  free  variables 
are  specified,   we  will  call  UB-YLU  51  times  to  pass  through  path  A. 
The  total  number  of  instructions  used  in  CHK-IEQ,  AGMT-VAE  and  51  UD-YLU 

is 

(3259L,  806ST,  5101A,  1737S,24o1t8r) 

:   parallel  computation, 
( i+l8575L,  39820ST,  38>r051A,  162525s ) 

:   serial  computation. 
In  order  to  get  an  estimation  for  path  B,  let  us  further  assume, 
that  the  infeasibility  of  the  current  partial  solution  is  detected  in 
the  middle  of  CHK-IEQ,  i.e.,  after  25  free  variables  are  specified,  and 
then  Backtrack  is  initiated.  Also  assume  that  the  last  50  variables  in 
the  current  partial  solution  are  underlined.   Then  for  this  path  B,  a 
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half  of  CHK-IEQ,  one  Backtrack  and  lG   UD-YLU  are  used.   Note  that  we 
neglect  the  feasible  case  of  the  partial  solution  because  this  occurs 
very  rarely  in  the  usual  computation.   The  total  number  of  instructions 
used  in  path  B  is: 

( 129IL, 377ST, 300A, 763S ) 

:  parallel  computation, 

(183566L,57677ST,6U025A, 118900s) 

:   serial  computation. 

Taking  the  figures  presently  available  for  ILLIAC  IV,  200  nano  seconds 
for  each  of  L,  S,  A,  ST  and  80  nano  seconds  for  R,  let  us  consider  that 
the  execution  of  R  is  2.5  times  faster  than  others,  and  the  execution 
time  for  L,  ST,  A,  S  which  are  the  same,  is  used  as  a  unit  time. 
Then,  according  to  the  above  argument,  path  A  needs 

20522   unit  times  :  parallel 
100^971  unit  times  :  serial, 

while  path  B  needs 

2731   unit  times  :  parallel 

k2kl6S  unit  times    :    serial. 
Assuming  that  paths  A  and  B  occur  evenly  as  a  typical  example, 
the   speed-up  attained  by  making  use   of  the  parallel  processing  of  ILLIAC  IV 
is  about  102' times,    judging  from  the  unit  times  necessary  as   estimated 
above. 
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5  Conclusion 


This  report  presents  a  preliminary  consideration  in  order  to 
implement  the  tapliclt  enumeration  algoritta  of  integer  programing 
problem  on  the  ILLIAC  IV  computer,  which  enables  us  to  perform  the  parallel 
processing  to  some  extent.   The  tentative  result  shown  in  the  report 
may  indicate  that  the  implementation  of  the  algorithm  on  I1LIAC  iv  U 
worth  while,  since  about  100  times  speed-up  suggests  that  a  considerably 
large  size  problem,  compared  with  the  problem  solvable  by  the  serial 
computation,  can  be  solved  on  ILLIAC  IV. 

Although  a  variety  of  programing  gi^ieks  md   reorganization 
of  the  algorithm  may  make  the  program  more  efficient,  the  speed-up 
factor  is  believed  not  to  deviate  too  much  from  that  obtained  in  this 


report. 


It  should  he  noticed,  however,  that  the  type  of  problem  assumed 
in  the  discussion,  i.e.  the  size  of  matrix  and  the  dense  coefficient 
matrix,  is  undoubtedly  one  of  the  most  favorable  types  for  ILLIAC  IV 
Therefore  the  future  investigation  of  the  problems  of  sparse  matrices, 
in  which  non-zero  coefficients  are  packed  in  the  memory,  and  large  size 
problems,  (much  larger  than  256  inequalities  and  1000  variables)  in 
which  the  coefficient  matrix  has  to  be  folded  in  the  memory,  complicating 
the  indexing  procedure,  is  essential  to  evaluate  the  parallel  processing 
of  the  implicit  enumeration  algorithm. 

Also  considering  that  ILLIAC  IV  is  constructed  such  that  the  unit 
time  used  in  the  previous  section  is  much  faster  than  that  of  conventional 
-chines,  this  speedup  due  to  hard-ware  improvement  should  be  further  taken 
mto  account,  to  estimate  realistically  the  value  of  ILLIAC  iv. 
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It  is  hoped  that  this  preliminary  investigation  will  be  followed 
by  more  detailed  discussion,  possibly  by  actual  coding,  to  verify  the 
usefulness  of  ILLIAC  IV  in  this  area. 

The  authors  are  grateful  for  the  encouragement  and  support  of 

Professor  D.  Slotnick. 
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